Multi-task Deep Neural Networks in Automated Protein Function Prediction

Authors

Ahmet Sureyya Rifaioglu

Tunca Doğan

Maria Jesus Martin

Rengul Cetin-Atalay

Mehmet Volkan Atalay

Abstract

Background: In recent years, deep learning algorithms have outperformed state-of-the-art methods in several areas, such as computer vision and speech recognition, thanks to efficient methods for training and for preventing overfitting, advances in computer hardware, and the availability of vast amounts of data. The high performance of multi-task deep neural networks in drug discovery has drawn attention to deep learning algorithms in bioinformatics. Protein function prediction is a crucial research area where more accurate prediction methods are still needed. Here, we propose a hierarchical multi-task deep neural network architecture based on Gene Ontology (GO) terms as a solution to the protein function prediction problem, and we investigate various aspects of the proposed architecture through several experiments.

Results: First, we showed that there is a positive correlation between the performance of the system and the size of the training datasets. Second, we investigated whether the level of GO terms in the GO hierarchy is related to their performance. We showed that there is no relation between the depth of GO terms in the GO hierarchy (i.e. general/specific) and their performance. In addition, we included all annotations in the training of a set of GO terms to investigate whether adding noisy data to the training datasets changes the performance of the system. The results showed that including less reliable annotations in the training of deep neural networks significantly increased the performance of the low-performing GO terms. Finally, we evaluated the performance of the system using a hierarchical evaluation method. Matthews correlation coefficients were calculated as 0.75, 0.49 and 0.63 for the molecular function, biological process and cellular component categories, respectively.

Conclusions: We showed that deep learning algorithms have great potential in the area of protein function prediction. We plan to further improve DEEPred by including other types of annotations from various biological data sources. Finally, we plan to release DEEPred as an open-access online tool.

A short version of this manuscript was accepted for oral presentation at the ISMB/ECCB 2017 Function-COSI meeting. arXiv:1705.04802v2 [q-bio.QM], 28 May 2017.
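For readers unfamiliar with the metric reported in the Results, the Matthews correlation coefficient can be computed from the four confusion-matrix counts of a binary classifier. The sketch below is a minimal illustration, not the authors' evaluation code: the function name `matthews_corrcoef_from_counts` and the example counts are assumptions made here, and the hierarchical evaluation method mentioned in the abstract is not reproduced.

```python
import math


def matthews_corrcoef_from_counts(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion-matrix counts.

    tp, tn, fp, fn are the numbers of true positives, true negatives,
    false positives and false negatives (hypothetical inputs here).
    Returns 0.0 when the denominator is zero, a common convention.
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


# Illustrative counts only; they are not taken from the paper.
score = matthews_corrcoef_from_counts(tp=90, tn=80, fp=20, fn=10)
print(f"MCC = {score:.2f}")
```

The coefficient ranges from -1 (total disagreement) through 0 (no better than chance) to +1 (perfect prediction), which is why it is often preferred over accuracy when positive and negative classes are imbalanced.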

Similar resources

Multi-Step-Ahead Prediction of Stock Price Using a New Architecture of Neural Networks

Modelling and forecasting Stock market is a challenging task for economists and engineers since it has a dynamic structure and nonlinear characteristic. This nonlinearity affects the efficiency of the price characteristics. Using an Artificial Neural Network (ANN) is a proper way to model this nonlinearity and it has been used successfully in one-step-ahead and multi-step-ahead prediction of di...

TopologyNet: Topology based deep convolutional and multi-task neural networks for biomolecular property predictions

Although deep learning approaches have had tremendous success in image, video and audio processing, computer vision, and speech recognition, their applications to three-dimensional (3D) biomolecular structural data sets have been hindered by the geometric and biological complexity. To address this problem we introduce the element-specific persistent homology (ESPH) method. ESPH represents 3D co...

A multi-scale convolutional neural network for automatic cloud and cloud shadow detection from Gaofen-1 images

The reconstruction of the information contaminated by cloud and cloud shadow is an important step in pre-processing of high-resolution satellite images. The cloud and cloud shadow automatic segmentation could be the first step in the process of reconstructing the information contaminated by cloud and cloud shadow. This stage is a remarkable challenge due to the relatively inefficient performanc...

Protein Secondary Structure Prediction: a Literature Review with Focus on Machine Learning Approaches

DNA sequence, containing all genetic traits is not a functional entity. Instead, it transfers to protein sequences by transcription and translation processes. This protein sequence takes on a 3D structure later, which is a functional unit and can manage biological interactions using the information encoded in DNA. Every life process one can figure is undertaken by proteins with specific functio...

Comparison of Artificial Neural Networks and Cox Regression Models in Prediction of Kidney Transplant Survival

Cox regression model serves as a statistical method for analyzing the survival data, which requires some options such as hazard proportionality. In recent decades, artificial neural network model has been increasingly applied to predict survival data. This research was conducted to compare Cox regression and artificial neural network models in prediction of kidney transplant survival. The prese...

Publication date: 2017
